Minimum Entropy Set Cover Problem for Lossy Data Compression
Abstract
The classical minimum entropy set cover problem consists in finding the most likely assignment between a set of observations and a given set of their types. The solution is a partition of the data space that minimizes the entropy of the distribution of types. The problem has natural applications in machine learning, clustering and data classification. In this paper we show that it is closely related to lossy data compression. In particular, we prove that minimum entropy set cover is a special case of a specific generalized entropy coding problem, and we establish the relation between the solutions of the two problems. Moreover, we propose a simple greedy algorithm which approximates the entropy of our lossy data compression within an additive term of log_2 e. The proof is based on a recent result obtained for minimum entropy set cover and on our partition reduction theorem for lossy data compression.
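As an illustration of the greedy idea mentioned in the abstract, here is a minimal sketch of the standard greedy heuristic for minimum entropy set cover: repeatedly pick the set that covers the most still-uncovered elements and assign those elements to it. This is not taken from the paper. The function names `greedy_entropy_cover` and `partition_entropy` are hypothetical, and the sketch assumes a finite universe with uniform weights on the elements, whereas the paper works with a more general lossy compression setting.

```python
from math import log2


def greedy_entropy_cover(universe, sets):
    """Greedily assign every element of `universe` to one of `sets`.

    Returns the list of blocks of the resulting partition, in the
    order in which they were chosen. Assumes uniform element weights.
    """
    sets = [set(s) for s in sets]
    uncovered = set(universe)
    parts = []
    while uncovered:
        # Pick the set that covers the most still-uncovered elements.
        best = max(sets, key=lambda s: len(uncovered & s))
        block = uncovered & best
        if not block:
            raise ValueError("the given sets do not cover the universe")
        parts.append(block)
        uncovered -= block
    return parts


def partition_entropy(parts):
    """Shannon entropy (in bits) of the block-size distribution."""
    n = sum(len(p) for p in parts)
    return -sum(len(p) / n * log2(len(p) / n) for p in parts)


if __name__ == "__main__":
    cover = [{0, 1, 2, 3}, {3, 4}, {4, 5}, {0, 5}]
    parts = greedy_entropy_cover(range(6), cover)
    print(parts, partition_entropy(parts))
```

In the example under `__main__`, the heuristic first takes the four-element set and then the set {4, 5}, giving blocks of relative size 2/3 and 1/3. Ties between equally good sets are broken by list order, which is an arbitrary choice made for this sketch.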
Similar resources
Partition Reduction for Lossy Data Compression Problem
We consider the computational aspects of the lossy data compression problem, where the compression error is determined by a cover of the data space. We propose an algorithm which reduces the number of partitions needed to find the entropy with respect to the compression error. In particular, we show that, in the case of a finite cover, the entropy is attained on some partition. We give an algorithmic...
Entropy Approximation in Lossy Source Coding Problem
In this paper, we investigate a lossy source coding problem, where an upper limit on the permitted distortion is defined for every dataset element. It can be seen as an alternative approach to rate distortion theory where a bound on the allowed average error is specified. In order to find the entropy, which gives a statistical length of source code compatible with a fixed distortion bound, a co...
GFWX: Good, Fast Wavelet Codec. ICT Tech Report ICT-TR-01-2016
Wavelet image compression is a popular paradigm for lossy and lossless image coding, and the wavelet transform, quantization, and entropy encoding steps are well studied. Efficient implementation is straightforward for the first two steps using e.g. lifting and uniform scalar deadzone quantization, but entropy encoding is typically carried out using complex context modeling and arithmetic codin...
A Survey of Various Data Compression Techniques
This paper is a survey of various methods of data compression. When the computer age came about in the 1940s, storage space became an issue. Data compression was the answer to that problem. The compression process takes an original data set and reduces its size by taking out unnecessary data. There are two main types of compression, lossy and lossless. This paper will deal exclusively with los...
of PACS browser with the rest of the network Cluster Controller
Despite over a decade of research and development, medical image compression has not yet been widely implemented on clinical picture archiving and communication systems (PACS). We have developed a prototype interface which incorporates both lossless and lossy compression into a browsing system that enables the efficient use of network and storage resources. Such a system allows a user to quick...